Representation Independent Data Analytics

Author

  • Jose Picado
Abstract

Over the past several years, users’ information needs over structured data have expanded from seeking exact answers to precise queries to performing database analytics tasks, such as finding entities or patterns similar to a given entity or pattern, discovering interesting patterns, or predicting novel relations and concepts. In response, the research community has proposed a multitude of algorithms to solve exploration and analytics problems over structured data. Since the properties of interesting and desirable answers are no longer precisely defined in the query, these algorithms use intuitively appealing heuristics to choose, from among all possible answers, those that are most likely to satisfy the user’s information need. Unfortunately, such heuristics typically depend on the precise choice of representation of the underlying database. Generally, there is no canonical representation for a particular set of content, and people often represent the same information in different ways. Thus, to use database analytics algorithms effectively, users generally have to restructure their databases into a suitable representation or change hyper-parameter settings. As a result, today’s database exploration and analytics algorithms and tools are usable only by highly trained data scientists who can predict which algorithms are likely to be effective for particular representations of the underlying database, and under which settings. To cope with the structural heterogeneity in large-scale data, we propose a novel approach to database analytics that treats representation as a first-class citizen. We introduce the concept of representation independence as the ability to deliver the same answers regardless of the structure chosen for organizing the data. Because representation independence may not always be achievable, we also consider representation scalability, the ability to return similar answers over different structures of data.
We discuss our work on providing ordinary users with an arsenal of effective database analytics methods that are robust across multiple representations of the same information. We present our ongoing work on creating representation independent analytics
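
As a rough illustration of why a structural heuristic can fail to be representation independent, the sketch below scores the similarity of two papers by counting their shared neighbors in a graph. It is a toy example and not the author's method: the entity names (`p1`, `p2`, `VLDB`, `proc1`, and so on), the two edge lists, and the helper functions are all hypothetical, and the two edge lists are assumed to encode the same information (which venue and year each paper belongs to).

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbors(adj, a, b):
    """Similarity heuristic: number of nodes adjacent to both a and b."""
    return len(adj[a] & adj[b])


# Hypothetical representation A: each paper links directly to its venue and year.
REP_A = [
    ("p1", "VLDB"), ("p1", "y2015"),
    ("p2", "VLDB"), ("p2", "y2016"),
]

# Hypothetical representation B: the same facts, but each paper links to a
# proceedings node, which in turn links to the venue and year.
REP_B = [
    ("p1", "proc1"), ("proc1", "VLDB"), ("proc1", "y2015"),
    ("p2", "proc2"), ("proc2", "VLDB"), ("proc2", "y2016"),
]

score_a = common_neighbors(build_graph(REP_A), "p1", "p2")
score_b = common_neighbors(build_graph(REP_B), "p1", "p2")
print(score_a, score_b)  # prints "1 0"
```

Under these assumptions the same pair of papers looks similar in one structure and unrelated in the other, even though both structures carry the same facts. A representation independent heuristic would return the same score in both cases.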

Similar resources

Representation Independent Analytics Over Structured Data

Database analytics algorithms leverage quantifiable structural properties of the data to predict interesting concepts and relationships. The same information, however, can be represented using many different structures and the structural properties observed over particular representations do not necessarily hold for alternative structures. Thus, there is no guarantee that current database analy...

P-V-L Deep: A Big Data Analytics Solution for Now-casting in Monetary Policy

The development of new technologies has confronted the entire domain of science and industry with issues of big data's scalability as well as its integration with the purpose of forecasting analytics in its life cycle. In predictive analytics, the forecast of near-future and recent past - or in other words, the now-casting - is the continuous study of real-time events and constantly updated whe...

Big Data Analytics and Now-casting: A Comprehensive Model for Eventuality of Forecasting and Predictive Policies of Policy-making Institutions

The ability of now-casting and eventuality is the most crucial and vital achievement of big data analytics in the area of policy-making. To recognize the trends and to render a real image of the current condition and alarming immediate indicators, the significance and the specific positions of big data in policy-making are undeniable. Moreover, the requirement for policy-making institutions to ...

A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection

Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....

Application of Big Data Analytics in Power Distribution Network

Smart grid enhances optimization in generation, distribution and consumption of the electricity by integrating information and communication technologies into the grid. Today, utilities are moving towards smart grid applications, most common one being deployment of smart meters in advanced metering infrastructure, and the first technical challenge they face is the huge volume of data generated ...


Publication date: 2017